A database tool comprises a computer-implemented method
for extracting systematic information from one or more
databases that apparently only comprise data noise or
seemingly unrelated data items. Criminal and community
relationships that exist amongst telephone and internet
subscribers are extracted from large telephone databases derived
from wire taps and/or long distance telephone records. A 
telephone records file is used that comprises a caller's 
telephone number, dialed telephone numbers, and the time. 
A second database comprises a list of telephone numbers 
which are suspicious for some reason, and a descriptor as to 
why each such telephone number is suspicious. A third 
database includes biographical data about the telephone 
subscribers, such as name, address, and other facts. The 
unique telephone numbers in the database are identified.
Matches between the first and second databases are made. 
Related components are grouped into clusters. The valence 
for each telephone number is computed. The relational 
distances between each pair of telephone numbers in a 
cluster are determined. The telephone numbers are repre- 
sented as points in the x,y-plane of a display, with the 
distance between the points representing the strength of the 
relationship based on call frequency and other criteria. An 
interactive interface is provided for the user to click on items 
to see the background information associated with each 
point. 

11 Claims, 1 Drawing Sheet

DATABASE ORIGAMI

RELATED APPLICATIONS

This application claims the benefit of two U.S. Provisional
Patent Applications, a first being Ser. No. 60/027,893,
filed Oct. 7, 1996, titled DATABASE BROWSER, and a
second being Ser. No. 60/036,689, filed Jan. 31, 1997, and
titled DATABASE ORIGAMI. Such Provisional Applications
are incorporated herein by reference.

BACKGROUND OF THE PRESENT 
INVENTION 

1. Field of the Present Invention 

The present invention is called DATABASE ORIGAMI 
because of its ability to unfold and clarify complex, subtle 
and difficult-to-uncover relationships and other information
hidden in large databases. The present invention relates 
generally to database browsers and crawlers and more 
particularly to computer-implemented methods and systems 
for extracting useful systematic information for law 
enforcement, the intelligence community, and other organi- 
zations with complex analytical requirements from tele- 
phone and internet subscriber records, textual, and non- 
textual databases. 

2. Description of Related Art 

In 1995, an information management software company, 
Oracle Corporation, introduced a Microsoft Windows-based 
software tool designed to help law enforcement investigators 
more effectively manage and solve cases. Such was mar- 
keted as the special investigative unit support system 
(SIUSS). Information management technology is used to
provide insights into criminal activity and to reduce the time
needed to bring cases to their successful resolution. The
Oracle SIUSS collects, stores and analyzes case intelligence
information related to complex conspiracies, violent crimes,
drug trafficking, and other major cases. The tool combines
conventional analytical techniques with job-specific infor- 
mation collection and lead generation analysis. Inputs can be 
received from various investigative sources, e.g., surveil- 
lance teams, forensics experts, wire room operators, citizen 
tips. The computed conclusions are provided as case leads 
and made available to agency management, analysts and 
investigators. 

Some prior art law enforcement systems simply gather 
and store factual information, e.g., names, birthdates, and 
time of a telephone call. The Oracle SIUSS attempts to 
develop leads the way investigators do, by starting with 
known facts and combining them for further insight and 
leads. SIUSS is based on a conventional relational database 
management system. Criminal patterns are identified by 
linking subjects, vehicles, locations, businesses and other 
entities, within a case or among several cases. Information 
management is provided to users for investigative 
intelligence, telephone information, assets, financial data, 
arrests, seizures, credit card data, surveillance, mail covers, 
trash pickup and incidents. The telephone information can 
include toll, pen/DNR, and Title III data. Information can be
cross-referenced to uncover otherwise obscure and non- 
obvious relationships. 

Database analysis software is now being used by the 
Federal Bureau of Investigation (FBI), Immigration and 
Naturalization Service (INS), Department of Justice (DOJ), 
US Customs, Alcohol Tobacco and Firearms (ATF), state 
departments of public safety, and many other agencies at all 
government levels. Case information can be shared or kept
separate to any degree desired, depending on the needs of
the investigators. Information is gathered from the field in
hundreds of bits and pieces at different times and places,
and submitted to an automated link analysis. The Oracle
SIUSS uses pattern analysis to find information in the
timing and sequence of phone calls made by an investigation
target. Insights can be developed into how the target and his 
associates work together in a conspiracy. A conspiracy index 
is created according to the relationships of calls involving 
the target and others based on their phone numbers. Such 
relationships are proportional to their mutual involvement in 
a conspiracy. Secure databases, networking and encryption 
technologies are used to control the flow and accessibility of 
intelligence data outside the supplying agency. An Oracle 
SIUSS configuration can include a Microsoft WINDOWS 
operating system hosted on a personal computer, and such 
can be connected to virtually any type of file server. 

The prior art includes many so-called analytical computer 
programs for law enforcement. Most only store and retrieve 
data. Others make "analytical" graphics from associations 
the user must identify first. In general, investigative analysis 
looks for patterns, associations and profiles, and such infor- 
mation can help steer an investigator to previously unknown 
criminal activity. Computer programs are now being used to
discover in seconds what used to take days using index
cards. 

Telephone activity analysis involves the identification of
illicit operations, and of supervisors and their subordinates,
through telephone profiles. Conventional telephone activity
reports display the notes and plant numbers related to each
telephone number. If a number occurs in other plant or
subject files, the plants and subjects in which it occurs are
announced automatically. Financial activity can be combined,
sorted and key-word searched for one or more subjects. An
account number that appears in another financial plant is also
automatically announced. For example, common money
laundering methods often display various database indicia.
Universal pattern and association searches are conventionally
used to combine telephone, surveillance, financial and mail
activity, and then to look for any systematic patterns and
links. Relational links between a subject, a group, a
business, etc., are displayed.

The Institute for Intergovernmental Research
(Tallahassee, FL) markets specialized research software
for law enforcement agencies under the name CRIMINAL
INTELLIGENCE SYSTEM FOR MICROCOMPUTERS
(CIS). Law enforcement agencies are supposed to be able to
organize and access information on individual suspects or
suspect organizations in an easy-to-follow format. The CIS
program can accept up to seventy-two elements of
information on individuals, or forty elements for organizations.
CIS groups similar data together for both onscreen viewing and
printed reports. Information about individuals is categorized
into personal information, alias/moniker, associates,
criminal activity, and vehicle information. Organizations are
categorized into organization information, criminal activity,
and vehicle information. Such organizations can comprise
either a business or group suspected of criminal activity. CIS
allows access, modifications, additions, and printing of the
information. Full or partial descriptions can be entered for
most searches, and up to nine elements can be combined to
create a personalized search. CIS database records can be
transferred between microcomputers, for record sharing
within a department or between agencies.

The importance of this invention is that the concept of a
map of the datapoints can be extended to other kinds of
databases. For example, in a different embodiment of the
invention, the files from a seized computer can be
datapoints. A metric is defined and the distance between two
files is computed. This leads to the construction of the map
of datapoints. This concept can also be applied to
commercial applications such as transportation, retail sales and
marketing.

SUMMARY OF THE PRESENT INVENTION 

An object of the present invention is to provide a database 
tool for extracting systematic information from very large 
communication connection log databases and business
inventory databases. 

A further object of the present invention is to provide a 
database tool that displays relationships between database 
elements as proportional distances between clickable hyper- 
text points in a two or three dimensional graphic space. 

Another object of the present invention is to provide a law 
enforcement tool for analyzing large databases for obscure 
relationships amongst database entries. 

A still further object of the present invention is to provide 
an interactive database browser that automatically displays 
relationships from the point of view of particular database 
elements, with the closeness of such relationships displayed 
as proportional distances between clickable hypertext points 
in a two or three dimensional graphic space. 

Briefly, a database tool embodiment of the present inven- 
tion comprises a computer-implemented method for extract- 
ing systematic information from one or more databases that 
apparently only comprise data noise or seemingly unrelated 
data items. Criminal and community relationships that exist 
amongst telephone subscribers are extracted from large 
telephone databases derived from wire taps and/or long 
distance telephone records. A telephone records file is used 
that comprises a caller's telephone number, dialed telephone 
numbers, and the time and date. A second database
comprises a list of telephone numbers which are suspicious for
some reason, and a descriptor as to why each such telephone 
number is suspicious. A third database includes biographical 
data about the telephone subscribers, such as name, address, 
and other facts. The unique telephone numbers in the
database are identified. Matches between the first and sec- 
ond databases are made. Related components are grouped 
into clusters. The valence for each telephone number is 
computed. The relational distances between each pair of 
telephone numbers in a cluster are determined. The tele- 
phone numbers are represented as points in the x,y-plane of 
a display, with the distance between the points representing 
the strength of the relationship based on call frequency and 
other criteria. An interactive interface is provided for the 
user to click on items to see the background information
associated with each point. 
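
To make the idea of plotting numbers so that on-screen distance reflects relationship strength more concrete, the following is a minimal sketch only. The preferred placement mathematics is given in the incorporated Provisional Application and is not reproduced here, so this example swaps in a generic spring-style stress minimization. The function names `layout` and `plotted_distance`, the step size, and the iteration count are all hypothetical choices made for illustration.

```python
import math
import random


def layout(points, target, iterations=500, rate=0.05, seed=0):
    """Place points in the x,y-plane so plotted distances approach target distances.

    `target` maps frozenset({a, b}) to a desired distance. Pairs with no
    entry (for example, numbers that never communicated) are unconstrained.
    This is a generic gradient-descent sketch, not the patented procedure.
    """
    rng = random.Random(seed)
    pos = {p: [rng.random(), rng.random()] for p in points}
    for _ in range(iterations):
        grad = {p: [0.0, 0.0] for p in points}
        for pair, want in target.items():
            a, b = sorted(pair)
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            have = math.hypot(dx, dy) or 1e-9  # guard against coincident points
            pull = 2.0 * (have - want) / have  # gradient of (have - want)**2
            grad[a][0] += pull * dx
            grad[a][1] += pull * dy
            grad[b][0] -= pull * dx
            grad[b][1] -= pull * dy
        for p in points:
            pos[p][0] -= rate * grad[p][0]
            pos[p][1] -= rate * grad[p][1]
    return pos


def plotted_distance(pos, a, b):
    """Euclidean distance between two placed points."""
    return math.hypot(pos[a][0] - pos[b][0], pos[a][1] - pos[b][1])


# Hypothetical targets: A and B are closely related, B and C less so.
targets = {frozenset(("A", "B")): 1.0, frozenset(("B", "C")): 2.0}
positions = layout(["A", "B", "C"], targets)
```

In this sketch the pair A and C carries no target distance, so the two points are free to fall wherever the other constraints leave them.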

An advantage of the present invention is that a database 
tool is provided that is capable of detecting the evasive 
techniques used by criminals. Such evasive techniques can 
include the purchasing of a cellular phone, using it for a 
week, and then discarding it. 

Another advantage of the present invention is that a database
tool is provided that can quickly extract useful information 
from exceedingly large databases. This embodiment of the 
present invention enables the analysis of large databases of 
textual and non-textual data such as internet messages, 
reference materials, computer files and a wide variety of 
other large data systems. A file is created which allows the 
comparison of documents for similarities in content. Words 
in whatever language or arbitrary character strings are 
identified and scored for use in measuring relative distances
among data elements. Documents which are most similar are
positioned as closely as possible in the map illustrating their 
relationships. 
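
As an illustration of scoring words to measure how far apart two documents are, here is a small sketch. The text does not specify the scoring rule, so this example assumes a Jaccard distance over word sets. The names `tokens` and `content_distance` are hypothetical.

```python
import re


def tokens(text):
    """Lower-cased set of words; any run of letters, digits or underscores counts."""
    return set(re.findall(r"\w+", text.lower()))


def content_distance(doc_a, doc_b):
    """Jaccard distance between word sets (an assumed metric).

    Returns 0.0 when the vocabularies are identical and 1.0 when they
    share no words at all.
    """
    a, b = tokens(doc_a), tokens(doc_b)
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


near = content_distance("wire transfer to offshore account",
                        "wire transfer to local account")
far = content_distance("wire transfer", "weather report")
```

Documents with a smaller `content_distance` would be placed closer together on the map. Any other application-specific metric could be substituted without changing the rest of the procedure.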

BRIEF DESCRIPTION OF THE DRAWING 

FIG. 1 is a functional block diagram of a database browser 
embodiment of the present invention. 

DETAILED DESCRIPTION OF THE PRESENT 
INVENTION 

FIG. 1 represents a database tool embodiment of the 
present invention, referred to herein by the general reference 
numeral 10. The database tool 10 comprises a connection 
database 12, which for example can include connection logs 
and records of individual communication network subscribers
with a corresponding network address. Such communication
networks include, but are not limited to, the public
switched telephone network (PSTN), the e-mail network,
and the internet's world wide web (WWW). The subscriber
information obtainable from such networks includes business
and residential telephone subscribers' numbers, e-mail
account addresses, and TCP/IP internet addresses and packet
routing header data. In a prototype embodiment of the
present invention, the connection database 12 comprises a
disk file, called CHRONO.DAT for example, that stores
telephone call connection records. Each data element in
CHRONO.DAT includes a caller telephone number, a callee
telephone number, and the time and date of the connection
between the caller and callee.

A surveillance database 14 includes data entries of net- 
work addresses under a user's scrutiny. For example, the 
surveillance database 14, called SUSPECT.DAT, comprises 
a list of telephone numbers which are suspicious in nature 
and a descriptor as to why the telephone number is suspi- 
cious. 
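
A minimal sketch of how a SUSPECT.DAT-style list might be held in memory and compared against connection records follows. The on-disk layouts are not given in this text, so the tab-separated "number, reason" format, the record tuples, and the function names `load_suspects` and `match_suspects` are assumptions made purely for illustration.

```python
def load_suspects(lines):
    """Parse 'number<TAB>reason' lines into a dict (hypothetical layout)."""
    suspects = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        number, _, reason = line.partition("\t")
        suspects[number] = reason
    return suspects


def match_suspects(records, suspects):
    """Return the suspect numbers that appear as caller or callee in the records."""
    seen = set()
    for caller, callee, _when in records:
        seen.add(caller)
        seen.add(callee)
    return sorted(seen.intersection(suspects))


suspects = load_suspects([
    "2125550144\tlinked to a prior case",
    "",
    "9995550000\tanonymous tip",
])
records = [
    ("4155550101", "4155550199", "1997-01-31 09:15"),
    ("4155550101", "2125550144", "1997-01-31 09:40"),
]
matches = match_suspects(records, suspects)
```

The descriptor stays attached to each matched number through the `suspects` dictionary, so an interface can show why a number was flagged.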

Embodiments of the present invention may include or be 
hosted on conventional personal computer (PC) systems and 
workstation intranets and internets. For example, IBM- 
compatible PC's with the Microsoft WINDOWS operating 
system and Apple Computer MACINTOSH computers are 
particularly useful. In such cases, the preferred embodiments 
of the present invention will be loaded on such PC's via 
removable disk media or network downloads. The host 
operating system and hardware is then used by the embodi- 
ments of the present invention for execution, storage, and 
input/output.

A first computer program 16 is used for identifying each
unique communication network subscriber in the database
12. As described in U.S. Provisional Patent Application,
Ser. No. 60/036,689, filed Jan. 31, 1997, titled DATABASE
ORIGAMI, incorporated herein by reference, this program
generates two files of the unique telephone numbers
contained in the database 12 (CHRONO.DAT). A first of these
files, NUMFREQ.DAT for example, comprises the unique
telephone numbers sorted in numerical order. It also
comprises, for each number, a count of the calls in which that
phone number was either the caller or callee, i.e., the
frequency. A second file, called FREQORD.DAT for example,
comprises the same list of telephone numbers and
frequencies, but the data is sorted by frequency.
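
The two listings can be sketched in a few lines. This is an illustrative assumption about how such files might be produced, not the code of the Provisional Application. The record tuples, the function names, and the choice of descending order for the frequency listing are hypothetical. Numbers are treated as equal-length digit strings, so string order matches numerical order.

```python
from collections import Counter


def count_call_frequencies(records):
    """Count, for each number, the calls in which it was caller or callee."""
    freq = Counter()
    for caller, callee, _when in records:
        freq[caller] += 1
        freq[callee] += 1
    return freq


def numeric_order(freq):
    """NUMFREQ.DAT-style listing: (number, frequency) sorted by number."""
    return sorted(freq.items())


def frequency_order(freq):
    """FREQORD.DAT-style listing: sorted by frequency.

    Descending order with ties broken by number is an assumption.
    """
    return sorted(freq.items(), key=lambda item: (-item[1], item[0]))


records = [
    ("4155550101", "4155550199", "1997-01-31 09:15"),
    ("4155550101", "2125550144", "1997-01-31 09:40"),
    ("2125550144", "4155550199", "1997-02-01 18:02"),
    ("4155550101", "4155550199", "1997-02-02 07:55"),
]
freq = count_call_frequencies(records)
```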

A second computer program 18 is used for determining
each of a plurality of network addresses in the second
database of network addresses that are also included in the
first database. As described in U.S. Provisional Patent
Application, Ser. No. 60/036,689, filed Jan. 31, 1997, titled
DATABASE ORIGAMI, incorporated herein by reference, this
program compares telephone numbers in the database 14
(SUSPECT.DAT) with those in database 12 (CHRONO.DAT). An
output file is constructed, called SUSPECT.VEC for example,
and comprises the telephone numbers common to both
databases 12 and 14.

A third and a fourth computer program 20 and 22 are used
for parsing the network addresses into connected component
clusters. As described in U.S. Provisional Patent Application,
Ser. No. 60/036,689, filed Jan. 31, 1997, titled DATABASE
ORIGAMI, incorporated herein by reference, the third computer
program 20 (UNIQDBLE) finds any associations that exist, e.g.,
between the telephone numbers, in other words which numbers
were connected to what other numbers in the list of telephone
numbers. The fourth computer program 22, described in the
same Provisional Patent Application, takes the results of the
third computer program 20 (UNIQDBLE), and parses the list of
telephone numbers into connected component clusters. Each
connection address in a cluster must have evidenced at least
one communication connection with at least one other
connection address in the same connected component cluster,
and to no other connection address in any other connected
component cluster. Thus each connection address in a cluster
has made at least one connection with a second connection
address, and that second connection address has made at least
one connection with a third connection address, and so on. All
such connected addresses are included in one cluster. Other
clusters have no identified intersection with any other cluster.

A fifth computer program 24 is used for computing a
valence value for each network address. The valence value
represents the total number of other connection addresses
with which a particular connection address communicates,
either as a caller or a callee. Alternative embodiments of the
present invention attach some value to whether the
connection address is a caller or a callee. A pair of output
files are created by program 24. A first such file is named
NUMVAL.DAT for example, and comprises each telephone
number and its valence, sorted in numerical order by telephone
number. The second file, named FREQVAL.DAT, has the
same data, but is sorted by valence.

A sixth computer program 26 is used for computing a
"distance" between any two network addresses that have
evidently communicated with each other; in U.S. Provisional
Patent Application, Ser. No. 60/036,689, filed Jan. 31, 1997,
titled DATABASE ORIGAMI, incorporated herein by
reference, this program is named COMPDESC. Given M,
which is defined as the maximum number of communications
between any two telephone numbers in a given cluster,
the distance between a telephone number $p_i$ and a telephone
number $p_j$ is defined by

$$d_{i,j} = \frac{M}{n_{i,j}},$$

where $n_{i,j}$ is the number of times telephone number $p_i$
communicates with telephone number $p_j$. If $n_{i,j}=0$, the
distance between telephone number $p_i$ and telephone number
$p_j$ is not defined. Therefore, the more $p_i$ and $p_j$
communicate, the shorter will be the "distance" between
them. Such measure of "distance" has nothing to do with the
real physical geographical distances between telephone
subscribers.

A seventh computer program 28, called PLACE in U.S.
Provisional Patent Application, Ser. No. 60/036,689, filed
Jan. 31, 1997, titled DATABASE ORIGAMI, incorporated
herein by reference, is used for geometrically mapping the
network addresses to points in a mapping space plane or
volume relative to the distance computed by the sixth
computer program 26. Each telephone number is mapped into a
unique point in an x,y-plane or an x,y,z-volume. Such
procedure is computationally intensive, and its particular
implementation is critical to the present invention. The
preferred mathematics embodied in program 28 for these
calculations are described in U.S. Provisional Patent
Application, Ser. No. 60/036,689, filed Jan. 31, 1997, titled
DATABASE ORIGAMI, incorporated herein by reference.

An interactive interface computer program 30, called
DEPICT in U.S. Provisional Patent Application, Ser. No.
60/036,689, filed Jan. 31, 1997, titled DATABASE
ORIGAMI, incorporated herein by reference, is used for
plotting and displaying the mapping plane or volume to a
user. The interactive interface 30 provides for a user to be
able to manipulate each of the connected databases, and the
first through seventh computer programs 16-28, in order to
extract and present useful information to a user in an easy to
grasp format.

A biographical database 32, called MONTBIO.DAT for
example, includes biographical data related to the persons or
products associated with each connection address in the
surveillance database 14.

A personal computer host 34 includes a hard disk data
memory storage for large databases and a microprocessor
execution unit and support peripherals for downloading and
running disk operating systems and software application
programs.

A modem 36 provides two-way data communication to
other systems via dialed-up telephone lines or TCP/IP
internet connection for internet and other networks. Information
may be downloaded from or uploaded to data sources and
the databases 12, 14, and 32 via the modem 36.

Prior art programs and methods extract useful information
from a database of telephone numbers using conventional
link analysis to visualize any underlying relationships
among the data elements. The present invention avoids the
clutter generated by traditional link analysis, and yet
achieves its benefits without confusing the user with
Byzantine and overwhelming volumes of link line elements. The
present invention can detect evasive techniques commonly
used by criminals. One such evasion is the purchasing of a
cellular phone, using it for a week, and then discarding it.
Prior art link analysis has reached a dead end with large
databases because the technique leads to an overabundance
of links which seriously limits the effectiveness of useful
information extraction.

In alternative embodiments of the present invention, the
first and second databases 12 and 14 comprise between them
at least one of telephone company toll data and
dialed-number-recorder (DNR) records, retail and wholesale sales
register transaction records, credit card transaction records,
internet packet routing data, e-mail routing information,
caller-ID data captures, and cellular telephone
cell-switching and call-routing information. In general, the
information deposited to the first database 12 is voluminous and
has a low probability, but not a zero probability, of
comprising at least one connection address match with the
second database 14. For example, the first database 12 may
comprise all the telephone company toll data and
dialed-number-recorder records collected by an automatic billing
computer for an entire telephone company switching office,
area code, country or group of countries.

Another embodiment of the present invention consists of
a procedure for extracting useful information from
voluminous computer files such as may be obtained from computer
seizures by law enforcement agencies, downloaded research
files, medical databases, legal research files, news reports,
and many others. This embodiment involves an
application-specific metric defining a distance between any two files
based upon content. Once these distances have been defined,
the invention provides a rapid procedure for structuring,
arranging, and visualizing the relationships among the data
sets.
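
Returning to the telephone-record embodiment, the clustering, valence, and distance steps described earlier can be sketched as follows. This is an illustrative sketch under stated assumptions and not the code of the incorporated Provisional Application. The record tuples and all function names are hypothetical. The distance is read as M divided by the pairwise call count, which is one interpretation of the definition given in the text.

```python
from collections import Counter, defaultdict


def pair_counts(records):
    """Number of communications between each unordered pair of numbers."""
    counts = Counter()
    for caller, callee, _when in records:
        if caller != callee:  # ignore self-calls in this sketch
            counts[frozenset((caller, callee))] += 1
    return counts


def connected_clusters(counts):
    """Group numbers into connected components of the call graph."""
    neighbours = defaultdict(set)
    for pair in counts:
        a, b = tuple(pair)
        neighbours[a].add(b)
        neighbours[b].add(a)
    clusters, seen = [], set()
    for start in sorted(neighbours):
        if start in seen:
            continue
        stack, cluster = [start], set()
        while stack:  # depth-first walk over everything reachable
            node = stack.pop()
            if node in cluster:
                continue
            cluster.add(node)
            stack.extend(neighbours[node] - cluster)
        seen |= cluster
        clusters.append(cluster)
    return clusters


def valences(counts):
    """Valence: how many distinct other numbers each number communicates with."""
    val = Counter()
    for pair in counts:
        for number in pair:
            val[number] += 1
    return val


def cluster_distances(cluster, counts):
    """d = M / n for communicating pairs in a cluster; other pairs stay undefined."""
    inside = {pair: n for pair, n in counts.items() if pair <= cluster}
    m = max(inside.values())  # M: largest pairwise call count in the cluster
    return {pair: m / n for pair, n in inside.items()}


records = [
    ("5550101", "5550102", "1997-01-31 09:15"),
    ("5550102", "5550101", "1997-01-31 09:20"),
    ("5550101", "5550102", "1997-01-31 11:00"),
    ("5550102", "5550103", "1997-02-01 08:30"),
    ("5550104", "5550105", "1997-02-01 10:00"),
    ("5550105", "5550104", "1997-02-02 10:00"),
]
counts = pair_counts(records)
clusters = connected_clusters(counts)
distances = cluster_distances(clusters[0], counts)
```

In this sample, the pair that exchanged three calls ends up at distance 1.0, while the pair with a single call sits at distance 3.0. Numbers that never communicated directly have no entry, which mirrors the undefined case in the text.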

Although particular embodiments of the present invention 
have been described and illustrated, such is preferably not
intended to limit the present invention. Modifications and
changes will no doubt become apparent to those skilled in 
the art, and it is preferably intended that the present inven- 
tion only be limited by the scope of the appended claims. 

The present invention claimed is: 

1. A database tool for hosting on a computer with data
memory storage for databases and an execution unit for 
software programs, comprising: 

a first database that includes connection logs and records 
of individual communication-network subscriber with 
a corresponding communication-network-subscriber
address; 

a second database of data elements representing 
communication-network-subscriber addresses under a
user's scrutiny; 

first means for identifying each unique communication- 
network-subscriber address in the first database; 

second means for determining each of a plurality of 
communication-network-subscriber addresses in the 
second database of communication-network-subscriber
addresses that are also included in the first database; 

third means for parsing said communication-network- 
subscriber addresses into connected component clus- 
ters; 

fourth means for computing a valence value for each
communication-network-subscriber address;

fifth means for computing a "distance" between any two 
communication-network-subscriber addresses that 
have evidently communicated with each other; 

sixth means for geometrically mapping said 
communication-network-subscriber addresses to points 
in a mapping space plane or volume relative to said 
distance computed by the fifth means; and 

seventh means for plotting and displaying said mapping 
plane or space to a user. 

2. The database tool of claim 1, further comprising: 

an interactive interface for said user to manipulate each of 
the first and second databases, and the first through 
seventh means, in order to extract useful information.

3. The database tool of claim 2, further comprising: 

a third database that includes biographical information 
about persons or products associated with particular 
connection addresses in the second database and connected
to provide such biographical information to the
interactive interface for selectively informing said user 
of previously determined data about each said person or 
product associated with said particular connection 
address. 

4. The database tool of claim 1, wherein:

the first and second databases comprise between them at
least one of telephone company toll data and
dialed-number-recorder records, sales register transaction
records, credit card transaction records, internet packet
routing data, e-mail routing information, caller-ID data
captures, and cellular telephone cell-switching and
call-routing information.

5. The database tool of claim 1, wherein:

the first means for identifying each unique 
communication-network-subscriber address in the first 
database comprises a computer-implemented software 
program method hosted on a computer that generates 
two files of unique sets of telephone numbers contained 
in the first database, a first such file includes said 
unique telephone numbers sorted in numerical order 
with a frequency count of the number of calls in which 
that phone number was either the caller or callee, and 
a second such file with the same information but the 
data is sorted by the frequency count.

6. The database tool of claim 1, wherein: 

the second means for determining each of a plurality of 
communication-network-subscriber addresses in the
second database of communication-network-subscriber 
addresses that are also included in the first database 
comprises a computer-implemented software program 
method hosted on a computer that compares telephone 
numbers in the first database with those in the second 
database, and that outputs a file of the telephone 
numbers common to both the first and second data- 
bases. 

7. The database tool of claim 1, wherein: 

the third means for parsing said communication-network- 
subscriber addresses into connected-component clus- 
ters comprises a computer-implemented software pro- 
gram method hosted on a computer that finds 
communication connection associations that exist 
amongst a plurality of telephone numbers recorded in 
the first database, wherein each telephone number that
was connected to another telephone number at least 
once is isolated into a single connected-component 
cluster in which every member of the connected- 
component cluster can be chained to all of the others by 
calling-telephone number, called-telephone number, or 
both. 

8. The database tool of claim 1, wherein: 

the fourth means for computing a valence value for each 
communication-network-subscriber address comprises 
a computer-implemented software program method 
hosted on a computer that determines the total number 
of other connection addresses with which a particular 
connection address communicates, either as a caller or 
a callee, and that represents the outcome of such 
determination with a valence value.

9. The database tool of claim 1, wherein: 

the fifth means for computing a "distance" between any 
two communication-network-subscriber addresses that 
have evidently communicated with each other com- 
prises a computer-implemented software program 
method hosted on a computer that assigns imaginary 
relatively scaled distances between points, which rep- 
resent individual communication addresses in an 
imaginary plane or space, that are related to the number 
of times each such communication-network-subscriber 
addresses have had a communication connection 
recorded with another represented communication- 
network-subscriber address according to communica- 
tion connection information included in the first data- 
base. 

10. The database tool of claim 1, wherein: 

the sixth means for geometrically mapping said 
communication-network-subscriber addresses includes 
a host computer with a display monitor or printer for 
representing a plurality of points, which each represent
a single communication-network-subscriber address, in
a mapping space plane or volume relative to said
distance computed by the fifth means.

11. A database tool, comprising:

a host computer with data memory storage for databases
and an execution unit for software programs; 

a first database that includes connection logs and records 
of individual communication-network-subscriber 
addresses that includes at least one of telephone com- 
pany toll data and dialed-number-recorder records, 
sales register transaction records, credit card transac- 
tion records, internet packet routing data, e-mail rout- 
ing information, caller-ID data captures, and cellular 
telephone cell-switching and call-routing information;

a second database of data elements representing 
communication-network-subscriber addresses under a
user's scrutiny and that includes at least one of telephone
company toll data and dialed-number-recorder
records, sales register transaction records, credit card 
transaction records, internet packet routing data, e-mail 
routing information, caller-ID data captures, and cel- 
lular telephone cell-switching and call-routing infor- 
mation; 

first means for identifying each unique communication-
network-subscriber address in the first database com- 
prises a computer-implemented software program 
method hosted on a computer that generates two files of 
unique sets of telephone numbers contained in the first 
database, a first such file includes said unique telephone
numbers sorted in numerical order with a frequency 
count of the number of calls in which that phone 
number was either the caller or callee, and a second
such file with the same information but the data is 
sorted by the frequency count; 

second means for determining each of a plurality of 
communication-network-subscriber addresses in the 
second database of communication-network-subscriber
addresses that are also included in the first database 
comprises a computer-implemented software program 
method hosted on a computer that compares telephone 
numbers in the first database with those in second 
database, and that outputs a file of the telephone 
numbers common to both the first and second data- 
bases; 

third means for parsing said communication-network- 
subscriber addresses into connected-component clus- 
ters comprises a computer-implemented software pro- 
gram method hosted on a computer that finds 
communication connection associations that exist
amongst a plurality of telephone numbers recorded in 
the first database, wherein each telephone number that 
was connected to another telephone number at least
once is isolated into a single connected-component
cluster in which every member of the connected- 
component cluster can be chained to all of the others by 
calling-telephone number, called-telephone number, or
both; 

fourth means for computing a valence value for each 
communication-network-subscriber address comprises
a computer-implemented software program method 
hosted on a computer that determines the total number 
of other connection addresses with which a particular 
connection address communicates, either as a caller or 
a callee, and that represents the outcome of such 
determination with a valence value;

fifth means for computing a "distance" between any two
communication-network-subscriber addresses that 
have evidently communicated with each other com- 
prises a computer-implemented software program 
method hosted on a computer that assigns imaginary 
relatively scaled distances between points, which rep- 
resent individual communication addresses in an 
imaginary plane or space, that are related to the number 
of times each such communication-network-subscriber 
addresses have had a communication connection 
recorded with another represented communication- 
network-subscriber address according to communica- 
tion connection information included in the first data- 
base; 

sixth means for geometrically mapping said 
communication-network-subscriber addresses includes 
a host computer with a display monitor or printer for 
representing a plurality of points, which each represent
a single communication-network-subscriber address, in 
a mapping space plane or volume relative to said 
distance computed by the fifth means; 

seventh means for plotting and displaying said mapping 
plane or space to a user; 

a third database that includes biographical information 
about persons or products associated with particular 
connection addresses in the second database and con- 
nected to provide such biographical information to the 
host computer and for selectively informing said user 
of previously determined data about each said person or 
product associated with said particular connection 
address; and 

an interactive interface included in the host computer 
providing for a user to manipulate each of the first and 
second databases, and the first through seventh means, 
and to extract useful information about relationships 
that exist between communication-network-subscriber
addresses.